Industrial Project - 234313
3D map evolution - PTC
Supervisors: Eldad Finkelstein, Mordecai Sayag
Students: Saja Yassin, Marwan Billan
Introduction
3D models are becoming more important and applicable than ever before. They are used for industrial, medical and academic purposes.
In large-scale scenes, rescanning the entire scene can be expensive and time consuming.
Therefore, in this project we want to make the process more efficient by scanning only the portions that have evolved and fusing them into the base 3D model.
Goals
Main project objective:
The goal is to create an algorithm that decides, from color images (of the 3D model and from the scanning device, the HoloLens), whether the real world has changed and what kind of update is required.
Relevant Paper
We found the following paper, which surveys image-comparison methods and tools.
One of the discussed topics is feature-based image comparison, which gave us a basic direction for investigating the problem.
A link to the paper:
https://www.academia.edu/19962797/Image_comparison_Methods_and_Tools_A_Review?fbclid=IwAR29xwjbNTpjcEVJNLA988qm7j1qqt23iv0yP6evFjyqvZOUXw3d_YJKICM
Input and output example:
Synthetic image captured by the HoloLens
Real image captured by a phone
Desired output
Our solution in detail
1. Align the two images and resize them:
Run Dense-SIFT on the two images.
Find matching feature points from the synthetic to the real image using a k-NN matcher (with k=2).
Filter the matched feature points using these methods:
For each point's two candidate matches, discard the match whose distance is too large relative to the second-best (the distances are given by the k-NN matcher).
Find a homography from the synthetic to the real image using RANSAC, and then discard all the feature points that don't fit the homography, i.e., where the following distance is bigger than some threshold:
the distance is measured between the source feature points after applying the transformation and the destination feature points.
Crop the synthetic image according to the generated homography.
Resize the cropped image to fit the real image.
(According to the definition of the project, the synthetic image contains the real image.)
Real Image
Synthetic Image
Resized synthetic image
2. Find matches from the synthetic to the real image.
Run Dense-SIFT on the two images to find feature points.
Find matching feature points from the synthetic to the real image using a k-NN matcher (with k=2).
To refine the matched feature points, use the filter methods mentioned in step 1.
For each matched feature point, mark its surroundings as an unchanged area on the real image.
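Marking the surroundings of matched points can be sketched as below, assuming the matches are given as (x, y) pixel coordinates on the real image. The neighbourhood radius is an illustrative value, and `mark_unchanged` is a name chosen here, not the project's actual function.

```python
import numpy as np

def mark_unchanged(shape, matched_points, radius=10):
    """Paint a square neighbourhood around every matched point as unchanged.

    shape: (height, width) of the real image.
    Returns a mask where 0 = changed (black) and 255 = unchanged (white).
    """
    h, w = shape
    mask = np.zeros((h, w), dtype=np.uint8)      # everything changed by default
    for x, y in matched_points:
        y0, y1 = max(0, y - radius), min(h, y + radius + 1)
        x0, x1 = max(0, x - radius), min(w, x + radius + 1)
        mask[y0:y1, x0:x1] = 255                 # neighbourhood is unchanged
    return mask
```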
Real image Synthetic image
3. Find matches from the real to the synthetic image.
Find the matching feature points the same way as in step 2.
For each matched feature point, mark its surroundings as an unchanged area on the real image.
Synthetic image
Real image
Colored image: the red portion is the changed portion of the scene.
Mask image: the black portion is the changed portion of the scene; the white portion is the unchanged portion of the scene.
Our solution in pseudocode
GetPerfectMatch(synthetic image, real image):
    synth, real ← synthetic image, real image
    fp_synth, fp_real ← Dense-SIFT(synth), Dense-SIFT(real)
    matches_array ← KnnMatches(fp_synth, fp_real)
    good_matches ← []
    for m, n in matches_array:
        if m.distance < gamma_goodness * n.distance:
            good_matches.append(m)
    homography_matrix, matches_mask ← FindHomography(good_matches, fp_synth, fp_real, RANSAC)
    perfect_matches ← good_matches[matches_mask]
    return perfect_matches, homography_matrix
Pseudocode (continued)
Main(synthetic image, real image):
    synth, real ← synthetic image, real image
    perfect_matches, homography_matrix ← GetPerfectMatch(synth, real)
    resized_synth ← Resize(Crop(homography_matrix(synth)), SizeOf(real))
    perfect_matches1, _ ← GetPerfectMatch(resized_synth, real)
    perfect_matches2, _ ← GetPerfectMatch(real, resized_synth)
    mask ← CreateMask(Union(perfect_matches1, perfect_matches2))
    return mask
Project management
We had weekly meetings with the mentor in order to:
Introduce us to the product we were going to enhance.
Define the specification of the project (input, output, dataset, etc.).
Guide us to the materials needed to fill our knowledge gap.
Monitor our progress and give us academic advice.
Supply us with a real dataset.
We (the students) had weekly meetings, in which we learned, experimented,
developed and tested our program.
Difficulties:
Filling the knowledge gap, especially since we did not have organized resources as we are used to in other academic courses.
Working on real vs. real images and then adapting our solution to work on real vs. synthetic images.
Experimental Methodology
Our program has multiple parameters that affect the output.
Therefore, we ran tests with different values of these parameters:
Step size: the density of the feature points used in steps 2 and 3 (the resolution of the solution).
ransac_reproj_threshold: the RANSAC error threshold (see the ransacReprojThreshold parameter of cv2.findHomography).
max_iters: the maximal number of iterations that RANSAC performs before returning the transformation (see maxIters in cv2.findHomography).
The main goal of the testing is to evaluate how accurately and precisely our program can detect changed areas.
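A parameter sweep over these values can be sketched as below. The `run_comparison` callable is a hypothetical wrapper around the pipeline (it is not part of the project's code); the swept values are the ones we used for the real vs. synthetic dataset.

```python
from itertools import product

# Parameter grids used in the experiments.
STEP_SIZES = [10, 20, 50, 100]
MAX_ITERS_VALUES = [1, 1000, 2000]
REPROJ_THRESHOLDS = [1, 5, 9]

def sweep(run_comparison, synth, real):
    """Run the (hypothetical) pipeline once per parameter combination."""
    results = {}
    for step, iters, thr in product(STEP_SIZES, MAX_ITERS_VALUES, REPROJ_THRESHOLDS):
        results[(step, iters, thr)] = run_comparison(synth, real, step, iters, thr)
    return results
```

This yields 4 × 3 × 3 = 36 runs per image pair, one of which is the combination we eventually selected (step size 20, maxIters 2000, threshold 9).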
We tested our program on 2 main datasets:
1. 3 databases of pairs of real vs. real images capturing a parking lot over a full day:
All results with step sizes 15, 20, 25 and 30, where one comparison was done between each two consecutive images, while the other was done between the first image of the DB and all the other images:
https://drive.google.com/file/d/11vINn1NhvEg4KfOqNFONhEs-mhgkmRWn/view?usp=sharing
Time-lapse videos for step size 15:
https://drive.google.com/drive/folders/1SbLk3D92ifLNieXlZ6ScGeCYJgvI2KdX?usp=sharing
2. Pairs of real vs. synthetic images, taken from a phone and from the company's mesh respectively:
In order to see how the parameters affect the result, we ran the program on several real-synthetic image pairs with different values of the parameters.
E.g., for step size in [10, 20, 50, 100], maxIters in [1, 1000, 2000] and reprojectionThreshold in [1, 5, 9]. Results:
https://drive.google.com/file/d/1ZLeew_xshb5ZiA7iqFlY5PgxG_rcCeXh/view?usp=sharing
After analyzing the above examples, we decided that step size = 20, maxIters = 2000 and reprojectionThreshold = 9.0 give the best output for the given DB. Results:
https://drive.google.com/file/d/1lKHsroZuTOFB8PenIV1k0tUfAFuAR0Z5/view?usp=sharing
In the following slides, we will present selected results.
The examined frame
kitchen
stepSize10_maxIters1_reprojection9
stepSize10_maxIters2000_reprojection1
Aligned image (after step 1)    Result (green indicates the unchanged areas)
stepSize10_maxIters2000_reprojection5
stepSize100_maxIters2000_reprojection9
Aligned image (after step 1)    Result (green indicates the unchanged areas)
stepSize20_maxIters2000_reprojection9
stepSize10_maxIters2000_reprojection9
Aligned image (after step 1)    Result (green indicates the unchanged areas)
Results and conclusions
Step size:
In the above results we can see that using a large step size might cover huge areas when the change is small, while using a small step size might select only small portions of a larger change.
We observed in some of the experiments (especially in real vs. real) that the smaller the step is, the more small changes are detected and the more accurately the changes are marked, but also the more falsely detected changes we get (see an example of real vs. real results with different step sizes in the following slides).
The smaller the step size is, the larger the run time is, as expected: a small step size means dense feature points, i.e., a large number of feature points, so the program needs more time to analyze, match and filter them.
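The run-time effect of the step size can be made concrete with a small count: for an H x W image, the dense grid contains roughly (H/step) * (W/step) feature points, so halving the step quadruples the number of points to match. The Full-HD resolution below is an illustrative choice.

```python
def grid_points(h, w, step):
    """Number of dense-grid feature points for an h x w image."""
    return len(range(step, h, step)) * len(range(step, w, step))

# Grid sizes for the step sizes used in the real vs. real experiments,
# on an illustrative 1080 x 1920 frame.
counts = {s: grid_points(1080, 1920, s) for s in (15, 20, 25, 30)}
```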
The examined frame
From 100GOPRO DB
Frame:G0020080.JPG
Frame:G0020081.JPG
Step size 30
Step size 20
Step size 15
Step size 25
max_iters: when using a larger max_iters, RANSAC performs more iterations in order to find the transformation, which leads to a more accurate transformation (we can see this reflected in the kitchen examples).
However, in some cases (e.g., when the images are not good enough), setting smaller values for max_iters may yield better results.
ransac_reproj_threshold: (range 0-10)
The RANSAC reprojection threshold indicates the distance threshold between the source feature points after the transformation and the destination feature points.
Therefore, the smaller the threshold, the more strict the transformation; the bigger the threshold, the more relaxed the transformation.
In our real vs. real experiments this parameter did not make much difference, but in real vs. synthetic, we had better results when we used a big threshold.
A possible explanation: the synthetic objects in the synthetic image are not a perfect copy of the real objects, so we need to add some relaxation and use a big threshold.
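The distance being thresholded can be computed explicitly: map each source point through the homography, normalize the homogeneous coordinate, and take the Euclidean distance to the matched destination point. This is a sketch of that computation; the function name and the example points are illustrative.

```python
import numpy as np

def reprojection_errors(H, src_pts, dst_pts):
    """Reprojection error per match.

    H: 3x3 homography; src_pts, dst_pts: (N, 2) arrays of matched (x, y).
    """
    ones = np.ones((len(src_pts), 1))
    proj = (H @ np.hstack([src_pts, ones]).T).T   # map through H (homogeneous)
    proj = proj[:, :2] / proj[:, 2:3]             # back to cartesian
    return np.linalg.norm(proj - dst_pts, axis=1) # distance to destination

# A match is kept when its error is below ransac_reproj_threshold:
# inliers = reprojection_errors(H, src, dst) < threshold
```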
In the next slide you can see an example that demonstrates this situation.
stepSize10_maxIters10_reprojection1
stepSize10_maxIters10_reprojection9
Aligned image (after step 1)    Result (green indicates the unchanged areas)
In the first part of the algorithm (aligning the two images and resizing them), in order to find feature points we tried SIFT and dense SIFT with step sizes in [25, 15, 10]; dense SIFT with step size 15 had the best results:

Method                      Failed to find homography    Found a false homography
SIFT                        0.066                        0.46
Dense SIFT, step size 25    0.46                         0.166
Dense SIFT, step size 15    0.066                        0.066
Dense SIFT, step size 10    0.0                          0.25
If RANSAC fails to find a homography, the program returns that the whole frame has changed.
If RANSAC finds a false homography, steps 2 and 3 of the algorithm fail or, maybe worse, mark some areas as unchanged (false negatives); since we prefer false positives over false negatives, failing to find a homography is better than finding a false one.
Link to the results that contain the resized images:
https://drive.google.com/file/d/1eiGW-ukk5jq9M69li1julPLgRHKj_XRL/view?usp=sharing
Note: in the result folders, a file with a “None” suffix indicates a failure to find a homography.
Future Work
The best parameter values for the program depend on the dataset, and can be very different from one dataset to another.
ML could be used to predict the best parameter values for a given dataset.
Benchmarking over more varied datasets.
Summary
In this presentation we discussed our program, which takes as input a synthetic image (captured by the HoloLens) and a real image.
It decides whether the real world has changed and what kind of update is required to update the mesh.
Thank you very much
It has been fun